System and method for displaying and analyzing financial correlation data

ABSTRACT

A method for displaying a matrix of correlations or other statistical measures of co-movement associated with a plurality of financial instruments, portfolios, indices, or asset classes is disclosed. The method includes: converting the matrix of correlations or other co-movement measures into a probability transition matrix; defining a corresponding abstract distance measurement between any two of the plurality of financial instruments, portfolios, indices, or asset classes based on the probability transition matrix; assigning coordinates in a Euclidean space to each of the plurality of financial instruments, portfolios, indices, or asset classes, wherein a Euclidean distance between any two financial instruments, portfolios, indices, or asset classes in the Euclidean space corresponds to the corresponding abstract distance measurement; and displaying on a display device the plurality of financial instruments, portfolios, indices, or asset classes based on more significant dimensions of the Euclidean space.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application is a continuation of U.S. patent application Ser. No. 14/751,093, filed on Jun. 25, 2015, which is a continuation of U.S. patent application Ser. No. 14/102,234, filed on Dec. 10, 2013, now U.S. Pat. No. 9,098,877, which is a continuation of U.S. patent application Ser. No. 13/754,816, filed Jan. 30, 2013, now U.S. Pat. No. 8,629,872, the contents of all of which are incorporated herein by reference.

FIELD

Aspects of embodiments of the present invention relate to systems and methods of displaying and analyzing correlation data and other statistical measures of co-movement for financial assets and portfolios.

BACKGROUND

Stocks and other securities and financial instruments are frequently arranged in portfolios or other collections containing numerous different financial assets. A problem in portfolio management is understanding the co-movement of different assets or asset classes, and the implications for portfolio construction and risk management. According to one embodiment, co-movement is the correlation of asset prices or valuations over time (for example, which stocks tend to rise or drop in value as a group). Diversifying such portfolios (such as including assets whose financial behavior tends to be independent over time as opposed to being highly correlated) is one of several important financial investment functions.

Analyzing the correlation of a small number (such as six or fewer) of different stocks may be relatively straightforward. For example, one may directly examine the matrix of correlations or co-movement indicators (i.e., an N×N matrix for N assets) since such a matrix has relatively few distinct entries (at most a few dozen for N≤6). However, when analyzing a large number of assets (for example, 100 or 500 such assets), the number of combinations of any two of them grows quadratically and quickly overwhelms any attempt by an investor to grasp the structure of the correlation matrix as a whole, or to derive salient characteristics from it for investment purposes.

SUMMARY

Embodiments of the present invention are directed toward systems and methods of displaying and analyzing financial correlation data. Further embodiments are directed toward displaying financial correlation data of large numbers of assets in meaningful graphical depictions that reduce the underlying complexity of the numerous interrelationships, thus making them significantly simpler to appreciate. Still further embodiments are directed to analyzing the displayed correlation data (for example, measuring overall portfolio concentration).

In an exemplary embodiment, a system and method for constructing a three-dimensional (3-D) scatter diagram for displaying on a display device is provided. In the scatter diagram, each of the points represents an asset. Correlation between any two assets is represented by the distance between their corresponding points in the scatter diagram, with highly correlated assets being close to each other in the scatter diagram, weakly correlated assets being far apart, and the degree of correlation being inversely proportional to the distance between the corresponding points. Such a scatter diagram permits visual analysis of assets and portfolios to identify concentrations of risk, including risk concentrations that might otherwise go unnoticed.

According to an exemplary embodiment of the present invention, a method for displaying a matrix of correlations or other statistical measures of co-movement associated with a plurality of financial instruments, portfolios, indices, or asset classes is provided. The method includes: converting the matrix of correlations or other co-movement measures into a probability transition matrix; defining a corresponding abstract distance measurement between any two of the plurality of financial instruments, portfolios, indices, or asset classes based on the probability transition matrix; assigning coordinates in a Euclidean space to each of the plurality of financial instruments, portfolios, indices, or asset classes, wherein a Euclidean distance between any two financial instruments, portfolios, indices, or asset classes in the Euclidean space corresponds to the corresponding abstract distance measurement; and displaying on a display device the plurality of financial instruments, portfolios, indices, or asset classes based on more significant dimensions of the Euclidean space.

A number of the more significant dimensions may be three.

The more significant dimensions may include three of the most significant dimensions.

The displaying of the financial instruments, portfolios, indices, or asset classes may include displaying an identifying label for each of the financial instruments, portfolios, indices, or asset classes in a 3-dimensional Euclidean representation on the display device.

The method may further include modifying the 3-dimensional Euclidean representation on the display device in response to a user command.

The method may further include displaying successive representations of correlation data or other statistical measures of co-movement as observed on successive dates.

The method may further include adjusting a color or size of the identifying label to correspond to a respective value of an additional numerical characteristic being displayed in the 3-dimensional Euclidean representation on the display device for each of the financial instruments, portfolios, indices, or asset classes.

A number of the more significant dimensions may be two.

The more significant dimensions may include two of the most significant dimensions.

The displaying of the financial instruments, portfolios, indices, or asset classes may include displaying an identifying label for each of the financial instruments, portfolios, indices, or asset classes in a 2-dimensional Euclidean representation on the display device.

The method may further include generating a measure of diversification of the financial instruments, portfolios, indices, or asset classes.

The generating of the measure of diversification of the financial instruments, portfolios, indices, or asset classes may include generating the measure of diversification using the more significant dimensions of the Euclidean space.

The measure of diversification may include a global concentration, a relative global concentration, or a largest local concentration.

The measure of diversification may include a global concentration. The generating of the global concentration may include: assigning a weight to each of the financial instruments, portfolios, indices, or asset classes; and weighting a contribution of each of the financial instruments, portfolios, indices, or asset classes by its respective said weight in the global concentration.

The method may further include generating a portfolio diversification measure by: identifying ones of the financial instruments, portfolios, indices, or asset classes; assigning second weights to respective said ones of the financial instruments, portfolios, indices, or asset classes; and generating the global concentration by only using the ones of the financial instruments, portfolios, indices, or asset classes in place of each of the financial instruments, portfolios, indices, or asset classes, and using the second weights in place of the weight of each of the financial instruments, portfolios, indices, or asset classes.

The method may further include generating a sequence of successively less significant local concentrations of the financial instruments, portfolios, indices, or asset classes.

The method may further include generating a plurality of relative local concentrations of the Euclidean space.

The method may further include generating a numerical summary measure of accuracy with which the Euclidean distance as measured in the more significant dimensions of the Euclidean space represents the corresponding abstract distance measurement.

The method may further include changing a sign of one of the coordinates for improving consistency of the displaying of the financial instruments, portfolios, indices, or asset classes over a period of time.

The method may further include re-ordering the coordinates for improving consistency of the displaying of the financial instruments, portfolios, indices, or asset classes over a period of time.

The financial instruments may include publicly traded equity securities, publicly traded fixed income securities, publicly available mutual funds, exchange-traded funds, publicly traded currencies, exchange-traded futures, or options on exchange-traded futures.

The method may further include: in response to the displaying on the display device, receiving a user command to modify an attribute for a selected one of the plurality of financial instruments, portfolios, indices, or asset classes; and modifying the attribute in response to the user command.

The attribute may correspond to an investment amount.

The correlations may correspond to financial returns.

According to another exemplary embodiment of the present invention, a system for displaying a matrix of correlations or other statistical measures of co-movement associated with a plurality of financial instruments, portfolios, indices, or asset classes is provided. The system includes a processor, a display device coupled to the processor, and a nonvolatile storage device coupled to the processor and storing instructions. The instructions, when executed by the processor, cause the processor to: convert the matrix of correlations or other co-movement measures into a probability transition matrix; define a corresponding abstract distance measurement between any two of the plurality of financial instruments, portfolios, indices, or asset classes based on the probability transition matrix; assign coordinates in a Euclidean space to each of the plurality of financial instruments, portfolios, indices, or asset classes, wherein a Euclidean distance between any two financial instruments, portfolios, indices, or asset classes in the Euclidean space corresponds to the corresponding abstract distance measurement; and display on the display device the plurality of financial instruments, portfolios, indices, or asset classes based on more significant dimensions of the Euclidean space.

A number of the more significant dimensions may be three.

The instructions, when executed by the processor, may further cause the processor to control the display device to display the financial instruments, portfolios, indices, or asset classes by displaying an identifying label for each of the financial instruments, portfolios, indices, or asset classes in a 3-dimensional Euclidean representation on the display device.

The processor may be further configured to receive a user command. The instructions, when executed by the processor, may further cause the processor to modify the 3-dimensional Euclidean representation on the display device in response to the user command.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

The accompanying drawings, together with the specification, illustrate exemplary embodiments of the present invention. These drawings, together with the description, serve to better explain aspects and principles of the present invention.

FIG. 1 illustrates an exemplary computer system for collecting, generating, remapping, displaying, and analyzing correlation data of financial assets according to an embodiment of the present invention.

FIG. 2 shows an exemplary 3-D plot of different financial assets according to an embodiment of the present invention.

FIGS. 3-6 are exemplary screen shots of three dimensional (3-D) scatter diagrams of financial asset correlations according to an embodiment of the present invention.

FIG. 7 shows an exemplary method of displaying a matrix of correlations for numerous financial assets according to an embodiment of the present invention.

FIG. 8 shows an exemplary method of creating a scatter diagram according to an embodiment of the present invention.

FIG. 9 shows an example of the scatter diagram method of FIG. 8 being applied to a small number of assets according to an embodiment of the present invention.

FIGS. 10-12 show exemplary 2-D and 3-D scatter diagrams of S&P 100 return data according to an embodiment of the present invention.

FIG. 13 shows an exemplary 3-D scatter diagram of S&P 500 return data accumulated over three different periods according to an embodiment of the present invention.

FIG. 14 shows an exemplary method of generating a single summary measure of the diversification of a group of assets in a portfolio according to an embodiment of the present invention.

FIG. 15 shows an exemplary 3-D scatter diagram of S&P 500 return data highlighting five local concentrations according to an embodiment of the present invention.

FIG. 16 shows an exemplary 3-D scatter diagram of S&P 500 return data zooming in on the largest local concentration in FIG. 15 according to an embodiment of the present invention.

FIG. 17 shows an exemplary 3-D scatter diagram of individual sectors of the U.S. bond market according to an embodiment of the present invention.

DETAILED DESCRIPTION

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. In the embodiments, for ease of description, financial assets or instruments (for example, stocks, bonds, currencies, etc.) are discussed and referred to simply as “assets” or even more simply as “stocks.” Collections of such assets (such as indices or weighted combinations of the assets) are referred to as “portfolios.” It is to be understood, however, that as used in this disclosure, an asset (or stock) refers to any financial instrument (such as a publicly traded equity security, fixed income security, mutual fund, exchange-traded fund, currency, futures or options on exchange-traded futures, etc.) as well as collections of these assets into portfolios, indices, or asset classes.

FIG. 1 illustrates an exemplary computer system for collecting, generating, remapping, displaying, and analyzing correlation data of financial assets according to an embodiment of the present invention.

In FIG. 1, the computer system 100 includes a computer 110 (or computer device), a display device 120 (such as a laptop computer), an optional controller 125 (such as a gaming controller), and a nonvolatile storage device 130. The computer 110 may be, for example, a server computer, a personal computer, or any such computing device including a processor for executing machine instructions and a memory for accessing data (such as numbers or calculations derived from numbers) and machine instructions (such as code for programs intended to be run on the computer system 100).

The display device 120 is configured to display images (such as from processed data or other visual depictions) as directed by the computer 110. To this end, the computer 110 may include a graphics processor to render specialized graphics, such as three-dimensional (3-D) images, that may be manipulated (for example, rotated or zoomed, such as by a user) in various ways. The display device 120 may be, for example, a flat screen display device, or a laptop computer (as illustrated in FIG. 1). The display device 120 may be a 2-D display device capable of rendering views of 3-D images. The display device 120 may also be a 3-D display device. The laptop computer may be equipped with the graphics processor in place of, or in addition to, the computer 110. There may also be a separate controller 125, such as a gaming controller (as illustrated in FIG. 1) to assist with display interaction between a user and the computer 110 and/or display device 120. In one embodiment, the controller 125 assists the user in rotating, zooming, and panning a 3-D image displayed on the display device 120.

The nonvolatile storage 130 may be, for example, a disk drive for accessing and storing data over time with the computer 110. For instance, the nonvolatile storage 130 may be used to store a database of financial information, such as asset values, or program code (such as computer instructions or modules to be loaded and run on the computer 110 to perform embodiments of the invention). The computer 110 may also be connected to a network 140 (such as a local area network (LAN), a wide area network (WAN), and/or a public WAN such as the Internet) for communication with external sources of information, data, and resources.

According to one embodiment, the memory of the computer 110 (and/or the display device 120, when the display device is implemented as, for example, a laptop computer) may store one or more modules for performing various tasks. A collection module 150 may be configured to collect raw financial data based on historical returns and/or generate projected financial data based on predictive methods such as modeling or extrapolating. A correlation module 160 may be configured for generating correlation data of financial assets from the financial data produced by the collection module 150. A remapping module 170 may be configured to remap the correlation data produced by the correlation module 160 into a form more amenable to display and analysis. A display module 180 may be configured for displaying the remapped correlation data generated by the remapping module 170 on a display device (such as in a 3-D scatter diagram). An analysis module 190 may be configured to further process the remapped correlation data generated by the remapping module 170 (for example, to determine an overall portfolio measurement of concentration or diversification). Throughout this discussion, the foregoing modules are assumed to be separate functional units, but those skilled in the art will recognize that the functionality of various units may be combined or integrated into a single module, or further subdivided into further sub-modules without departing from the spirit of the invention.

In addition, while the computer system 100 of FIG. 1 shows several different separate components, the present invention is not limited thereto. In other embodiments, additional components (such as client computers) may be present, or various components (such as processors) may be integrated, including the entire computer system (such as in a laptop computer).

FIG. 2 shows a simplified 3-D plot of different financial assets according to an embodiment of the present invention.

In FIG. 2, four assets are displayed by the display module 180 on a display device. While the exemplary scatter diagrams described in embodiments above may often be laid out accurately on a flat page (e.g., in 2-D), adding a third dimension (e.g., in 3-D) frequently enhances the accuracy and allows relationships to be displayed that may not be shown in two dimensions. For example, if four assets A, B, C, and D each exhibit a pairwise correlation of 0.5, representing that 0.5 faithfully by equal distances between each pair of points (i.e., AB=AC=AD=BC=BD=CD=0.5) would be impossible in two dimensions. However, such a relationship may be displayed in three dimensions, as shown in FIG. 2.

In the illustrated example of FIG. 2, the distance between any pair of assets indicates the correlation between those assets. For example, in the illustrated example where the distance between any pair of assets is 0.5, the correlation between any pair of the assets will also be the same.
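The four-asset example can be checked numerically. The sketch below is illustrative only: the coordinates are one possible placement (the vertices of a regular tetrahedron), chosen here as an assumption and not taken from FIG. 2. It relies on the geometric fact that at most three points in a plane can be mutually equidistant, whereas four such points fit in three dimensions.

```python
import itertools
import math

# Assumed target: every pair of the four assets is 0.5 apart.
EDGE = 0.5

# Vertices of a regular tetrahedron; any two vertices differ in exactly two
# coordinates by 2*s, so their distance is 2*s*sqrt(2) = EDGE.
s = EDGE / (2 * math.sqrt(2))
points = {
    "A": (s, s, s),
    "B": (s, -s, -s),
    "C": (-s, s, -s),
    "D": (-s, -s, s),
}

# Distance for each of the six pairs AB, AC, AD, BC, BD, CD.
pairwise = {
    (p, q): math.dist(points[p], points[q])
    for p, q in itertools.combinations(points, 2)
}

for pair, distance in pairwise.items():
    print(pair, round(distance, 6))
```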

FIGS. 3-6 are exemplary screen shots of three dimensional (3-D) scatter diagrams of financial asset correlations according to an embodiment of the present invention. The scatter diagrams may depict correlations of particular financial assets over particular periods of time. For example, in the example scatter diagrams of FIGS. 3-6, correlations of 500 stocks from the Standard & Poor's (S&P) 500 index are measured at various times. The particular financial assets in the scatter diagrams may be identified via identifying labels or well-known symbols, such as, for example, the symbols by which their corresponding shares are traded and listed. For instance, the symbol IBM may be used to represent International Business Machines Corporation, while MSFT may be used to represent Microsoft Corporation.

According to one embodiment, the labels are displayed in the scatter diagrams with varying sizes where the selected size may reflect, for example, the corresponding company's market cap (e.g., market capitalization, such as the total value of the issued shares). According to one embodiment, larger labels may be used for indicating proportionally larger market caps of particular assets, such as for AAPL (Apple Incorporated), XOM (Exxon Mobil Corporation), GE (General Electric Company), JNJ (Johnson & Johnson), PFE (Pfizer Incorporated), and T (AT&T Incorporated). In other embodiments, the market cap may be represented using other visual indicators, such as, for example, color. In still other embodiments, the size of the label may be used to reflect another measurement, such as volatility (e.g., variation of the price of the stock or other financial instrument over time).

According to one embodiment of the invention, the color of each label may indicate historical returns for the identified asset. The labels may be displayed in red, for example, to indicate negative returns, or in green to represent positive returns. To better track selected stocks, more distinctively colored labels may be used.

According to one embodiment of the invention, the closer two stocks depicted in the scatter diagram are (in distance), the more correlated their financial returns are over time (for example, the more their stock values rise or fall together, or the more similar their stock fluctuations are with respect to other stocks). According to one embodiment, the correlation between two stocks over the relevant period is inversely proportional to the Euclidean distance between the two stocks depicted in the 3-D scatter diagram.

In addition, because the data is displayed in a 3-D format, it may be better appreciated on a display device (such as a 3-D display or a 2-D display with graphics support for 3-D renditions, such as for rotating and zooming). For instance, the symbols of the scatter diagram may be oriented to face the same direction although not limited thereto. In an exemplary embodiment for display on a 2-D display device, when the displayed 3-D image is rotated, the symbols also rotate to create the impression that the symbols maintain their orientation throughout the rotation (which helps highlight the depth of the symbol on the 2-D display device, that is, the dimension perpendicular to the 2-D display surface). The generation of these scatter diagrams may be performed by the display module 180. In this regard, the display module 180 is configured to render points in a multi-dimensional space as a scatter diagram that visually displays (on a suitable display device) correlation data and other numerical characteristics of the corresponding financial assets.

In the exemplary scatter diagram of FIG. 3, 12 months of correlation and return data for the S&P 500 are depicted based on return data collected through December 2002. FIG. 4 is a similar scatter diagram for April 2012 (almost ten years later). A person of skill in the art reviewing the scatter diagrams of FIGS. 3 and 4 will understand that Microsoft (MSFT) stayed relatively close to Oracle (ORCL) in both 2002 and 2012 (indicating a high degree of correlation at both periods), while Verizon (VZ) and AT&T (T) moved far away from these two companies (MSFT and ORCL) by 2012 (indicating a loss of correlation in 2012 that was not seen in 2002).

According to one embodiment, the 500 symbols in the exemplary scatter diagrams of FIGS. 3-6 further create a cloud-like image. For example, FIG. 5 depicts the 12 months of correlation and return data for assets in the S&P 500 from March 2006, while FIG. 6 depicts likewise for March 2009 (three years later). As shown in FIG. 5 for the 2006 data, the individual stocks are fairly spread out, creating an image of a sparse cloud, with most stocks performing generally independently of the others. However, as shown in FIG. 6 for the 2009 data, the stocks are more tightly bunched, creating an image of a dense cloud, with most stocks exhibiting a tighter correlation with the others. The 2006 data corresponds to a healthier time in the economy. By contrast, the 2009 data corresponds to a market downturn and the early stages of a recovery therefrom. Thus, the size of the cloud may be used as an indication of how much the individual stocks move as a group during the corresponding period being displayed: the smaller the cloud, the more co-movement of the group as a whole.
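The text does not say how the "size of the cloud" would be quantified. One simple possibility, shown here purely as an assumption, is the root-mean-square distance of the points from their centroid. The function name `cloud_size` and the coordinates below are hypothetical and are not data from FIGS. 5 and 6.

```python
import math


def cloud_size(points):
    """Root-mean-square distance of the points from their centroid.

    This is an assumed measure of cloud size, not one stated in the text.
    """
    n = len(points)
    dims = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dims)]
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in points) / n)


# Hypothetical 3-D coordinates for a handful of assets in a sparse period.
sparse_cloud = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, -2.0)]

# The same layout shrunk by half, standing in for a period of tighter co-movement.
dense_cloud = [(x / 2, y / 2, z / 2) for x, y, z in sparse_cloud]

print(cloud_size(sparse_cloud), cloud_size(dense_cloud))
```

Under this measure, halving every coordinate halves the reported size, so a denser cloud yields a smaller number.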

Exemplary Techniques

In an exemplary embodiment, financial return data between different assets may be accumulated over time by the collection module 150, such as over a 12-month period according to any mechanism conventional in the art. This financial return data may also be projected by the collection module 150 based on other sources of information (e.g., modeling, simulation, extrapolation, external sources) as will be apparent to a person of skill in the art. Regardless of the collection method, the result is a set of data points (in N-dimensional space, for N separate assets), with a value for each asset in each data point.

A common measure of the co-movement of two assets is the correlation between their returns. In an exemplary embodiment, the correlation between any pair of assets may then be determined by the correlation module 160 using any statistical measure of correlation. For ease of description, an assumption is made that this correlation may be transformed into a similarity coefficient (or similarity kernel) K(x, y), which is a nonnegative number representing the similarity between a pair of assets x and y using the corresponding values for x and y in the set of data points. The correlation module 160 may compute K(x, y) for each combination of assets x and y.

In exemplary embodiments of the present invention, K(x, y) has the property that the closer the similarity between two assets x and y, the larger the value of K(x, y). Thus, in one exemplary embodiment, for a given asset x, the similarity kernel K(x, y) maximizes when comparing x to the same asset, K(x, x), and minimizes (for example, takes on the value 0) when comparing x to a completely independent asset y. In addition, for ease of description, K(x, y) is assumed to have the property that K(x, y)=K(y, x), that is, K(x, y) is symmetric, though the invention is not limited thereto.

As a non-limiting example of such a similarity kernel K(x, y), in one embodiment, 1+corr(x, y) is used by the correlation module 160 to determine the correlation data, where corr(x, y) is the standard correlation coefficient. The function corr(x, y) takes on values between −1 and +1, with −1 representing perfect negative linear correlation, 0 indicating no linear correlation (such as for independent data), and +1 indicating perfect positive linear correlation. The function corr(x, y) is undefined when either of the two variables takes on a constant value over the entire set of data points. The function 1+corr(x, y) is thus a nonnegative number that measures the linear correlation between the two variables and takes on values between 0 and 2.
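A minimal sketch of the kernel 1+corr(x, y) follows, taking corr to be the Pearson correlation coefficient. The function names and the return series are hypothetical and chosen only to exercise the endpoints of the range from 0 to 2.

```python
import math


def corr(x, y):
    """Pearson correlation coefficient of two equal-length return series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Matches the text: corr is undefined for a constant series.
        raise ValueError("corr is undefined when a series is constant")
    return cov / math.sqrt(var_x * var_y)


def similarity(x, y):
    """Similarity kernel K(x, y) = 1 + corr(x, y), with values in [0, 2]."""
    return 1.0 + corr(x, y)


# Hypothetical periodic returns for three assets.
a = [0.01, 0.03, -0.02, 0.04]
b = [2 * r for r in a]   # moves exactly with a
c = [-r for r in a]      # moves exactly against a

print(similarity(a, b), similarity(a, c))
```

Raising an error for a constant series is a design choice made for this sketch; an implementation could instead drop such an asset or assign it a default similarity.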

Given the similarity kernel K(x, y), in one exemplary embodiment, the correlation module 160 computes K(x, y) for all pairs (x, y) of the N assets under consideration using the corresponding entries in the set of data points. Assuming K(x, y)=K(y, x) for any pair (x, y) of assets and that K(x, x) is fixed (such as 2) for each asset x, the correlation module 160 thus defines N(N−1)/2 separate correlation coefficients, one for each pair (x, y) of distinct assets. In one exemplary embodiment, the correlation module 160 numbers the N assets x₁, x₂, . . . , x_(N), and arranges these similarity coefficients in an N×N correlation matrix M, where the ith row represents asset x_(i), the jth column represents asset x_(j), and similarity coefficient M_(ij)=K(x_(i), x_(j)) for all i and j between 1 and N.
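The arrangement of the coefficients into M can be sketched as below. Because K is assumed symmetric with a fixed diagonal, only the N(N−1)/2 pairs of distinct assets are evaluated and each value is mirrored. The asset names, the correlation table, and the function `build_matrix` are hypothetical.

```python
from itertools import combinations


def build_matrix(assets, kernel, diagonal=2.0):
    """Arrange K(x_i, x_j) in an N x N matrix, evaluating each pair once."""
    n = len(assets)
    m = [[diagonal if i == j else 0.0 for j in range(n)] for i in range(n)]
    evaluations = 0
    for i, j in combinations(range(n), 2):
        value = kernel(assets[i], assets[j])
        m[i][j] = value  # entry for row i, column j
        m[j][i] = value  # mirrored entry, since K(x, y) = K(y, x)
        evaluations += 1
    return m, evaluations


# Hypothetical pairwise correlations for three made-up assets.
hypothetical_corr = {("AAA", "BBB"): 0.8, ("AAA", "CCC"): -0.2, ("BBB", "CCC"): 0.1}


def kernel(x, y):
    """K(x, y) = 1 + corr(x, y), looked up from the table above."""
    key = (x, y) if (x, y) in hypothetical_corr else (y, x)
    return 1.0 + hypothetical_corr[key]


assets = ["AAA", "BBB", "CCC"]
M, pair_count = build_matrix(assets, kernel)
print(pair_count)  # N(N-1)/2 kernel evaluations
for row in M:
    print(row)
```

For N=100 the same loop would make 4,950 kernel evaluations, and for N=500 it would make 124,750.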

For N separate assets, the above techniques process N-dimensional points, which may be difficult to display with two or three dimensions for larger values of N, such as N=100 or 500. Accordingly, in an exemplary embodiment, the remapping module 170 remaps the correlation data produced by the correlation module 160 into a lower dimensional space for easier use in displaying the correlation data and in further processing the correlation data.

In this regard, the row vectors of M may be examined using a mathematical theory of random walks (or random walk Markov chains) as is well known in the art. In order to transform the correlation coefficients into probability format (for use in random walks), in one exemplary embodiment, the correlation module 160 divides each row of M by its corresponding row sum. This operation yields a probability transition matrix P. For a given row i (representing the ith asset x_(i)), the jth entry P_(ij) thus represents the relative correlation of x_(j) to x_(i), namely by the proportion P_(ij)/P_(ii), subject to the restriction that the row sum

${\sum\limits_{j}P_{ij}} = 1.$
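The row normalization just described can be sketched as follows. The matrix M below holds hypothetical similarity coefficients with a diagonal of 2, and the function name is an assumption made for this illustration.

```python
def to_transition_matrix(m):
    """Divide each row of M by its row sum so that every row of P sums to 1."""
    p = []
    for row in m:
        total = sum(row)
        if total <= 0:
            raise ValueError("row sum must be positive to normalize")
        p.append([value / total for value in row])
    return p


# Hypothetical symmetric similarity matrix for three assets.
M = [
    [2.0, 1.8, 0.8],
    [1.8, 2.0, 1.1],
    [0.8, 1.1, 2.0],
]
P = to_transition_matrix(M)

for row in P:
    print([round(value, 4) for value in row], "row sum:", round(sum(row), 12))
```

Although M is symmetric, P generally is not, because each row is divided by a different row sum. The ratio P_(ij)/P_(ii) equals M_(ij)/M_(ii), since both entries share the same divisor.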

The remapping module 170 considers each entry P_(ij) of P to represent a probability that a hypothetical trader (or arbitrageur) would exchange asset x_(i) with asset x_(j) in one time step. In one embodiment, the row vectors of P are used to perform a “random walk” (i.e., a series of asset exchanges) where the intent may be to keep as similar a portfolio as possible from a risk perspective (such as trying to favor asset exchanges when they appear to exchange assets having similar correlation data). That is, stocks whose behavior is not similar are not likely to be exchanged for one another during the “random walk.” The “random walk” is used to define an abstract distance measure d(x, y) between every pair of assets x and y, such that d(x, y) is greater for pairs of assets x and y which are less likely to be exchanged for one another. For example, d(x, y) can be the standard (Euclidean) distance between the corresponding row vectors of x and y in P.
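The example distance named at the end of the preceding paragraph, the Euclidean distance between two rows of P, can be sketched as below. The matrix P is hypothetical, and this is only the one example of d(x, y) that the text gives; other definitions derived from the random walk are possible.

```python
import math


def abstract_distance(p, i, j):
    """d(x_i, x_j): Euclidean distance between rows i and j of P."""
    return math.dist(p[i], p[j])


# Hypothetical transition matrix: assets 0 and 1 have similar rows, asset 2 does not.
P = [
    [0.5, 0.4, 0.1],
    [0.4, 0.5, 0.1],
    [0.1, 0.1, 0.8],
]

print(abstract_distance(P, 0, 1))  # similar rows, small distance
print(abstract_distance(P, 0, 2))  # dissimilar rows, larger distance
```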

The remapping module 170 then assigns coordinates to each asset in Euclidean space, with dimension equal to N−1 where N is the number of assets, in such a way that for each pair of assets x and y, the Euclidean distance between their coordinates is equal to their abstract distance measure d(x, y), thus providing a concrete geometrical realization of the abstract distance measure d(x, y). Thus, in one exemplary embodiment, when considering asset similarity to be a random walk, the remapping module 170 uses diffusion maps to perform the remapping of the correlation data generated by the correlation module 160.
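To illustrate that N points with given pairwise distances can be placed in (N−1)-dimensional space, the sketch below anchors the first asset at the origin, forms the Gram matrix of the remaining assets by the law of cosines, and reads coordinates off a Cholesky factor. This construction is an assumption chosen for brevity; it is not the diffusion-map technique named in the text, and it presumes the distances are realizable in Euclidean space. The matrix P is hypothetical.

```python
import math


def embed(dist):
    """Place N points in R^(N-1) so that pairwise distances match dist."""
    n = len(dist)
    k = n - 1
    # Gram matrix of the vectors from asset 0 to every other asset.
    g = [
        [(dist[0][i + 1] ** 2 + dist[0][j + 1] ** 2 - dist[i + 1][j + 1] ** 2) / 2.0
         for j in range(k)]
        for i in range(k)
    ]
    # Cholesky factor: lower-triangular rows whose dot products reproduce g.
    lower = [[0.0] * k for _ in range(k)]
    for j in range(k):
        pivot = g[j][j] - sum(lower[j][c] ** 2 for c in range(j))
        lower[j][j] = math.sqrt(pivot) if pivot > 1e-12 else 0.0
        for i in range(j + 1, k):
            if lower[j][j] > 0.0:
                partial = sum(lower[i][c] * lower[j][c] for c in range(j))
                lower[i][j] = (g[i][j] - partial) / lower[j][j]
    # Asset 0 sits at the origin; the rows of the factor are the other assets.
    return [[0.0] * k] + lower


# Hypothetical transition matrix for four assets (each row sums to 1).
P = [
    [0.5, 0.3, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.1, 0.1, 0.2, 0.6],
]
dist = [[math.dist(a, b) for b in P] for a in P]
coords = embed(dist)

for point in coords:
    print([round(value, 4) for value in point])
```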

Upon building and displaying a scatter diagram of the N points in an (N−1)-dimensional space based on the transition matrix P and upon a user visually inspecting the scatter diagram, the user may identify in a relatively straightforward manner the similarity (or dissimilarity) between two displayed assets. For example, the user may easily discern the similarity of their corresponding individual correlations with each of the other assets, as measured by K, based on how far apart their corresponding points appear in the scatter diagram.

However, (N−1)-dimensional space is generally hard to visualize beyond a smaller number of dimensions (such as two or three dimensions). Even if other representations (for example, size or color) were to be used to represent other dimensions, such an approach may not display more than five dimensions comfortably in terms of being able to visibly discern which assets are truly close to each other and which are far apart. Five dimensions is significantly smaller than, for example, the 100 dimensions needed to display the S&P 100 stocks (or the 500 dimensions needed to display the S&P 500 stocks).

Accordingly, embodiments of the present invention provide for a technique to display close approximations of these (N−1)-dimensional distances between these row vectors of P in a much lower dimensional space (such as 2-D or 3-D), which is considerably simpler to visualize. One such technique is to use diffusion maps, which is described in greater detail with reference to FIG. 8. In one exemplary embodiment, the remapping module 170 uses a diffusion map to remap the probability transition matrix (output by the correlation module 160) into a low-dimensional Euclidean space that closely preserves the distances (as measured by the correlation module 160) between the corresponding row vectors of the probability transition matrix.

The diffusion map may provide a way to visualize all the correlations together. In one exemplary embodiment, the remapping module 170 uses the diffusion map to translate the correlation matrix of a set of assets (as converted into a probability transition matrix by the correlation module 160) into a scatter diagram in a Euclidean space (such as a 3-D space), such that distances between points in the scatter diagram correspond to correlations between their associated assets. For example, the closer two assets are in the diagram, the higher their correlation.

FIG. 7 is a flow diagram of a process for creating and displaying a scatter diagram for numerous financial assets according to an embodiment of the present invention. The process may be described in terms of a software routine implemented, for example, by the correlation module 160. A person of skill in the art should recognize, however, that the process may be implemented via hardware, firmware (e.g. via an ASIC) or any combination of software, firmware, and/or hardware. Furthermore, the sequence of steps of the process is not fixed, but can be altered into any desired sequence as recognized by a person of skill in the art.

While the process in the exemplary embodiment of FIG. 7 is implemented on a matrix M of correlations for numerous financial assets, the term “financial assets” may encompass a wide range of financial entities, such as financial instruments, portfolios, indices, or asset classes. In addition, the term “correlations” may refer to any statistical measure of co-movement for these financial assets.

Processing begins, and in step 710, the correlation module 160 generates or otherwise obtains a matrix M of correlations for various financial assets. For example, the matrix M may be calculated from financial return data over time (as may be collected by the collection module 150), or it may be supplied from an external source.

In step 720, the matrix M of correlations or other co-movement measures is converted into a probability transition matrix P by the correlation module 160. For example, the correlation matrix M may be converted into a probability transition matrix P by multiplying M by the inverse of its corresponding degree matrix D (the diagonal matrix of the row sums of M), so that each row of P sums to 1.

In step 730, the correlation module 160 uses the probability transition matrix P to define a corresponding abstract distance measurement between any two of the financial assets for which correlations were obtained. For example, the distance between any two financial assets may be defined to be the standard (Euclidean) distance between their corresponding row vectors in P.
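Steps 720 and 730 can be sketched as follows. This is a minimal illustration rather than the implementation of the correlation module 160: the function names `transition_matrix` and `row_distances` and the three-asset correlation values are hypothetical, and the kernel K = 1 + corr is assumed so that every entry of M is non-negative.

```python
import numpy as np

def transition_matrix(M):
    """Row-normalize a similarity matrix M into P = D^-1 M (step 720)."""
    M = np.asarray(M, dtype=float)
    degrees = M.sum(axis=1)              # diagonal entries of the degree matrix D
    if np.any(degrees <= 0):
        raise ValueError("every row of M must have a positive sum")
    return M / degrees[:, None]

def row_distances(P):
    """Pairwise Euclidean distances between the row vectors of P (step 730)."""
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical correlations for three assets (illustrative values only).
corr = np.array([[1.0, 0.8, -0.2],
                 [0.8, 1.0, 0.1],
                 [-0.2, 0.1, 1.0]])
P = transition_matrix(1.0 + corr)        # assumed kernel K = 1 + corr
d = row_distances(P)
```

With these illustrative inputs, the two highly correlated assets (the first and second) end up closer together than the first and third, which is the behavior the abstract distance is meant to capture.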

In step 740, the remapping module 170 assigns coordinates in a Euclidean space to each of the financial assets via an assignment Ψ in such a way that the distance between any two financial assets in the Euclidean space closely corresponds to their corresponding abstract distance measurement defined in step 730. For example, the remapping module 170 may use a diffusion map to build Ψ from P, such that the abstract distance defined in step 730 is preserved (or nearly preserved) in the Euclidean space defined in step 740.

In step 750, the display module 180 displays the financial assets on a display device using the more significant dimensions of the Euclidean space. For example, the display module 180 may use the three most significant dimensions of the Euclidean space to build a 3-D scatter plot of the financial assets, such as, for example, the scatter diagrams of FIGS. 3-6. The display module 180 may then display the 3-D scatter diagram on a 3-D display or on a 2-D display that supports displaying 3-D images.

FIG. 8 is a more detailed flow diagram of a process for creating and displaying a scatter diagram for N financial assets x₁, x₂, . . . , x_(N) according to an embodiment of the present invention. The process may be described in terms of a software routine implemented, for example, by the correlation module 160. A person of skill in the art should recognize, however, that the process may be implemented via hardware, firmware (e.g. via an ASIC) or any combination of software, firmware, and/or hardware. Furthermore, the sequence of steps of the process is not fixed, but can be altered into any desired sequence as recognized by a person of skill in the art.

In step 810, a similarity coefficient K(x,y) is chosen by the correlation module 160. For example, K(x,y) may be 1+corr(x,y). In one exemplary embodiment, the correlation module 160 may choose a similarity kernel based on the type of corresponding financial data that is available for the assets (for example, financial data that is collected or otherwise obtained or generated by the collection module 150).

In step 820, the correlation module 160 creates an N×N correlation matrix M of the N assets, where M_(ij)=K(x_(i), x_(j)) for all i and j between 1 and N.

In step 830, the correlation module 160 divides each row of M by its corresponding row sum to yield a probability transition matrix P. In one exemplary embodiment, the correlation module 160 uses the (Euclidean) distance between the ith and jth row vectors (in N-dimensional space) to define a corresponding distance value between the corresponding assets x_(i) and x_(j).

In step 840, the remapping module 170 computes the eigenvalues λ₀, λ₁, λ₂, . . . , λ_(N-1) and corresponding eigenvectors ψ₀, ψ₁, ψ₂, . . . , ψ_(N-1) for P. Since P is a probability transition matrix, in one exemplary embodiment, the remapping module 170 sorts the eigenvalues such that λ₀=1 and the remaining eigenvalues decay rapidly, with 1=λ₀≥λ₁≥λ₂≥ . . . ≥λ_(N-1). With this in mind, the remapping module 170 writes the components of each eigenvector as ψ_(i)=(ψ_(i1), ψ_(i2), . . . , ψ_(iN)) for each i between 0 and N−1. It should be noted that ψ_(0j)=1/√N for each j between 1 and N.

In step 850, according to one exemplary embodiment, the remapping module 170 defines a diffusion map Ψ=(λ₁ψ₁, λ₂ψ₂, . . . , λ_(N-1)ψ_(N-1)). Ψ is best appreciated as a set of N−1 column vectors λ₁ψ₁, λ₂ψ₂, . . . , λ_(N-1)ψ_(N-1). While λ₀ψ₀ could be included in Ψ for completeness, its contribution to the corresponding row vector distances is 0 since every entry is just 1/√N. For the ith eigenvector ψ_(i) and jth asset x_(j), the remapping module 170 defines ψ_(i)(x_(j))=ψ_(ij).

The remapping module 170 also defines Ψ(x_(j))=(λ₁ψ₁(x_(j)), λ₂ψ₂(x_(j)), . . . , λ_(N-1)ψ_(N-1)(x_(j)))=(λ₁ψ_(1j), λ₂ψ_(2j), . . . , λ_(N-1)ψ_(N-1,j)) to be the corresponding row vector of the jth asset x_(j) under Ψ. From diffusion map theory (subject to requirements of the similarity kernel that are exhibited, or nearly exhibited, by many or most kernels that might be considered for evaluating asset correlation), Ψ preserves or nearly preserves the distance between corresponding row vectors that was present in the probability transition matrix P.

Thus, in step 860, the display module 180 displays each asset x_(j) on a 3-D display (or on a 2-D display using a 2-D rendition of the 3-D image) by using the three most significant components of the diffusion map, namely the corresponding point (λ₁ψ_(1j), λ₂ψ_(2j), λ₃ψ_(3j)), in a scatter diagram of the different assets, such as, for example, the scatter diagrams of FIGS. 3-6. In one embodiment, the display module 180 treats λ₀ψ_(0j) as the least significant component in the diffusion map (since it is constant and thus makes no distance contribution between assets), while the contribution from the further components λ₄ψ_(4j), λ₅ψ_(5j), . . . , λ_(N-1)ψ_(N-1,j) diminishes rapidly because of the diminishing values of the eigenvalues λ₄, λ₅, . . . , λ_(N-1). The diminishing eigenvalue property is such that for practical purposes, three dimensions is sufficient to pictorially present the assets x₁, x₂, . . . , x_(N) in a scatter diagram, while five dimensions is sufficient for most numerical calculation applications (as may be computed using the analysis module 190 that processes the diffusion map data).
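Steps 810 through 860 can be sketched as a short routine. The sketch below rests on several assumptions that are not taken from the text: the function name `diffusion_map` and the synthetic return data are hypothetical; the eigenvectors are computed through the symmetric matrix D^(-1/2) M D^(-1/2), which has the same eigenvalues as P; and the eigenvectors are scaled so that ψ₀ is identically 1 rather than 1/√N. With that scaling, the standard diffusion map identity holds exactly for the distance between rows of P weighted by the inverse stationary distribution, which is a weighted variant of the plain Euclidean row distance described above.

```python
import numpy as np

def diffusion_map(M):
    """Eigenvalues and diffusion coordinates for a symmetric similarity matrix M.

    Returns (lam, coords) with coords[j, k-1] = lam[k] * psi_k(x_j) for
    k = 1..N-1; the constant component k = 0 is dropped.
    """
    M = np.asarray(M, dtype=float)
    d = M.sum(axis=1)                    # row sums (degrees)
    pi = d / d.sum()                     # stationary distribution of P = D^-1 M
    # S is symmetric and similar to P, so eigh can be used (assumed approach).
    S = M / np.sqrt(np.outer(d, d))
    lam, V = np.linalg.eigh(S)
    order = np.argsort(-np.abs(lam))     # most significant eigenvalue first
    lam, V = lam[order], V[:, order]
    psi = V / np.sqrt(pi)[:, None]       # right eigenvectors of P, psi_0 constant
    coords = psi[:, 1:] * lam[1:]        # scale each eigenvector by its eigenvalue
    return lam, coords

# Synthetic monthly returns for six hypothetical assets sharing a common factor.
rng = np.random.default_rng(0)
factor = rng.normal(0.0, 0.04, size=(120, 1))
returns = factor + rng.normal(0.0, 0.05, size=(120, 6))

M = 1.0 + np.corrcoef(returns, rowvar=False)   # steps 810-820 with K = 1 + corr
lam, coords = diffusion_map(M)                 # steps 830-850
xyz = coords[:, :3]                            # step 860: three most significant components
```

Each row of `xyz` is the 3-D point for one asset; taking `coords[:, :5]` instead would give the five coordinates suggested for numerical analysis.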

In other embodiments, the fourth and fifth dimensions may be displayed in other ways, such as the color of the symbol used in the scatter diagram, the size of the symbol, the intensity of the symbol, the orientation of the symbol, and the like. In still other embodiments, the additional ways of displaying quantities are used to express different values related to the assets, such as market cap or portfolio weight, stocks of interest, volatility, and the like.

In this regard, the remapping Ψ not only allows the display module 180 to visually display correlation distance between assets in a low-dimensional space (such as two or three dimensions) on a display device, but, when the calculations are extended to five or six dimensions, also allows the analysis module 190 to perform accurate numerical analysis of the correlation distances using orders of magnitude fewer calculations than would be required if, for example, all 100 or 500 dimensions were considered using the probability transition matrix P alone.

While the process of FIG. 8 is defined in terms of a 3-D scatter diagram, in other embodiments, different numbers of dimensions are displayed. For example, in one embodiment, the two most significant components (λ₁ψ_(1j), λ₂ψ_(2j)) of the diffusion map are used to construct a corresponding 2-D scatter diagram.

FIG. 9 is a flow diagram of the computations that are performed for generating a scatter diagram according to the process of FIG. 8 when the process is applied to a small number of assets and using K(x, y)=corr(x, y) according to an embodiment of the present invention.

In the example of FIG. 9, three assets x₁, x₂, and x₃ are chosen to demonstrate the execution of the individual steps of the process of FIG. 8. The separately labeled steps 910-960 of FIG. 9 correspond to steps 810-860 of method 800.

In FIG. 9, the correlation module 160 defines the similarity kernel K in step 910 for each of the pairs of assets, in this case using K(x, y)=corr(x, y). The correlation module 160 builds the correlation matrix M from this similarity kernel data in step 920.

In step 930, the correlation module 160 converts the correlation matrix M to the probability transition matrix P. The remapping module 170 determines the eigenvalues λ₀, λ₁, and λ₂ of the transition matrix P along with the corresponding eigenvectors ψ₀, ψ₁, and ψ₂ in step 940. From these, the remapping module 170 determines the diffusion map Ψ in step 950, from which the display module 180 determines the coordinates (in a 2-D Euclidean space) to display on a display device in step 960. It should be noted that the distances between these 2-D points generated in step 960 and the corresponding distances between their respective row vectors in the transition matrix P are either identical or practically identical, which is one of the properties of diffusion maps.

The method of diffusion maps is general enough to handle a wide variety of similarity kernels. An attractive feature of the method is that it is quite robust to noise: small perturbations in the input financial data do not have large effects on the results. This robustness is helpful when dealing with real world financial data, which often contain a scattering of spurious values.

Other Similarity Kernels

While most of the above embodiments were discussed with reference to an exemplary similarity kernel K(x, y)=1+corr(x, y), the present invention is not limited thereto. In other embodiments, K(x, y) may represent any similarity kernel. For example, in other embodiments, K(x, y) may be:

-   The R² kernel: K(x, y) is the R² of a linear least-squares regression between the periodic returns of x and y. The R² kernel is closely related to the absolute correlation kernel |corr(x,y)| discussed below.
-   The angle kernel: Regarding the series of r returns on an asset as a vector in r-dimensional space, K(x,y) is the angle between the return vectors of x and y. More precisely, K(x,y) is π/2 minus this angle. That is, a smaller angle means that x and y are more similar.
-   The absolute correlation kernel: K(x,y)=|corr(x,y)| is given by the absolute value of the standard correlation coefficient corr(x,y). This similarity kernel does not take directionality into account. For example, a stock and a short position in that stock are regarded as very dissimilar by the correlation kernel 1+corr(x,y), but are as similar as possible using the other three kernels mentioned thus far.

The R² kernel, the angle kernel, and the absolute correlation kernel often give qualitatively similar results. However, still other possible kernels may not produce similar results. For instance, other kernels may capture different notions of similarity. For example, in another exemplary embodiment, K(x,y) may be the distance kernel: K(x, y) is given by (a suitable transformation of) the Euclidean distance between the return vectors of x and y. Unlike the above examples, this similarity kernel takes asset volatilities (and hence leverage) into account. For example, a stock and a 2× leveraged position in that stock are not regarded as very similar by the distance kernel, but are as similar as possible using the previous four kernels above.
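The contrast between these kernels can be illustrated with a short sketch. All function names here are hypothetical. The R² kernel uses the fact that, for a simple linear regression with an intercept, R² equals the squared correlation. The Gaussian transformation in `distance_kernel` is only an assumed stand-in for the unspecified “suitable transformation,” and the angle kernel is omitted because the text does not fix how angles larger than π/2 are treated.

```python
import numpy as np

def corr(x, y):
    """Standard correlation coefficient between two return series."""
    return float(np.corrcoef(x, y)[0, 1])

def correlation_kernel(x, y):
    return 1.0 + corr(x, y)

def abs_correlation_kernel(x, y):
    return abs(corr(x, y))

def r2_kernel(x, y):
    # R^2 of a simple linear regression (with intercept) equals corr^2.
    return corr(x, y) ** 2

def distance_kernel(x, y, scale=1.0):
    # Assumed transformation: Gaussian of the Euclidean distance between
    # the return vectors; the text does not specify the transformation.
    dist = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.exp(-(dist / scale) ** 2))

# Synthetic returns for a hypothetical stock, a short position, and 2x leverage.
rng = np.random.default_rng(1)
stock = rng.normal(0.01, 0.05, size=24)
short = -stock
levered = 2.0 * stock
```

With these series, the short position scores near 0 under `correlation_kernel` but near 1 under `abs_correlation_kernel` and `r2_kernel`, while only `distance_kernel` separates the stock from its 2× leveraged version.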

U.S. Equity Co-Movement Over the Past Decade

To further illustrate a method according to one embodiment of the present invention, monthly total return data for the period January 2002 to April 2012 is analyzed for the index constituents of the S&P 100 U.S. equity securities. There are considerably more than 100 stocks in the sample, since many stocks moved into or out of the index over the full time period.

FIGS. 10-12 show exemplary 2-D and 3-D scatter diagrams of S&P 100 return data for the period January 2002 to April 2012 according to an embodiment of the present invention.

These scatter diagrams show the results of applying the diffusion map method to S&P 100 constituent returns over the full decade of data, using the angle kernel. FIG. 10 shows a 2-D scatter plot (using the two most significant eigenvalues and their eigenvectors), while FIGS. 11-12 show two views of a 3-D scatter plot (using the three most significant eigenvalues and their eigenvectors). The two different views in FIGS. 11-12 help bring out some of the spatial structure of the diffusion map. Another way to observe the 3-D characteristics is to use the display module 180 to manipulate the graph on the display device, such as with rotations, panning, or zooming. In one exemplary embodiment, the 3-D scatter plot may be manipulated live by a user (such as by being rotated, panned, or zoomed) interacting with the display module 180.

The diffusion map, as displayed by the display module 180 in the scatter diagrams in FIGS. 10-12, exhibits some interesting properties. The stock symbols tend to form a cloud, with companies in the same industry often being clustered together. For example, in FIG. 10, energy companies appear at the top right (ConocoPhillips, COP; Apache, APA; Occidental Petroleum, OXY; etc.); pharmaceuticals and health care at the left (Merck, MRK; Baxter International, BAX; Abbott Laboratories, ABT; etc.); some banks at the bottom (Citigroup, C; Bank of America, BAC; US Bancorp, USB; etc.); and PC/server related firms in the middle (Dell, DELL; Microsoft, MSFT; etc.) of the cloud.

In addition, the clusters have different locations relative to the center of mass of the cloud. For example, in FIGS. 10-12, the PC/server related firms are near the center of the cloud, while the energy companies and the banks are at roughly opposite edges. Sometimes, a firm is some distance away from other firms in the same industry. For example, Apple (AAPL) and Gilead Sciences (GILD) appear sometimes in FIGS. 10-12 to be separated from others in their industry. Conversely, firms may be close to other firms that are in different industries. For example, the home improvement related stocks Lowe's (LOW) and Home Depot (HD) are close to the banks that engage in mortgage lending. This correlation may make sense given the time period, which encompasses the US housing boom and bust.

It should be noted that the 2-D representation in FIG. 10 may sometimes be misleading. For example, FIG. 10 overstates the degree of co-movement between Pfizer (PFE) and Amgen (AMGN), which appear more separated in FIGS. 11-12 because of the added third eigenvector. Each new eigenvector contributes successively less separation (because of the decaying eigenvalues), so three eigenvectors are frequently sufficient to exhibit the key separations, while very little distance impact is experienced after considering the first four or five eigenvalues and their eigenvectors. According to one exemplary embodiment, the display module 180 displays extra eigenvectors, such as the fourth or fifth most significant eigenvectors, on a display device through use of color or size of the symbols. According to another exemplary embodiment, the display module 180 may configure the axes to display different combinations of the first four or five eigenvectors at different times on the display device, which may be used to confirm that properties exhibited in displays of the first two or three eigenvectors are not significantly affected by the eigenvectors not being displayed.

With the diffusion map as exhibited in FIGS. 10-12, a person of skill in the art should recognize that an important feature of the diffusion map is the distances between points, which represent the correlation of different stocks. This same distance information is preserved even if the cloud is shifted or rotated. Further, the relative distances between different pairs of stocks have significance. That is, the distances may also be used to compare pairs of stocks where the distances are large. For example, in FIGS. 10-12, the fact that BAC and DELL are closer together than BAC and OXY is meaningful (e.g., DELL exhibits more similar behavior to BAC than OXY does), even though both of these relative distances are quite large.

FIG. 13 shows an exemplary 3-D scatter diagram of S&P 500 return data accumulated over three different periods according to an embodiment of the present invention.

As shown in FIG. 13, the stock symbols may be replaced with other representations, such as, for example, colored dots, with three diffusion maps overlaid, each of the diffusion maps generated using S&P 500 return data from one of three different multi-year sub-periods using the R² kernel. The most recent period (July 2009 to April 2012) may be represented in a first color, while July 2007 to June 2009 (the “crisis period”) may be represented by a second color, and July 2003 to June 2007 (the “credit boom”) may be represented in a third color. The size of the symbols may correspond to the average index weight of the corresponding stock over the relevant period.

FIG. 13 gives an idea how the overall diffusion map cloud may take on different sizes and concentrations over different periods. For example, from FIG. 13, the cloud may be seen as quite scattered during the credit boom, then contracted during the crisis period as stocks began to move more closely together, and remained more compact in the most recent period, particularly taking index weights into account.

While these overlaid diffusion maps in FIG. 13 contain quite a bit of information, this exhibit may be hard to read. It may be useful to have a quantitative measure of the global tendency of assets to move together, such as the size or compactness of the cloud as a whole. It may also be useful to be able to identify the most significant local concentrations within the cloud. Through use of the analysis module 190, the diffusion map method may also be applied to those problems.

Measuring Overall Portfolio Diversification

The size of the cloud, or the extent to which it is spread out, may reveal information about diversification across an entire group of assets, such as how much they all tend to move together, or how globally concentrated they are. A single summary measure may be useful. For example, it may allow portfolio concentration or diversification to be gauged with a single number, which may help with the selection of component assets in the portfolio. Embodiments of the present invention may compute such summary measures using the diffusion map and the analysis module 190.

FIG. 14 is a flow diagram of a process for generating a single summary measure of the diversification of a group of N assets in a portfolio according to an embodiment of the present invention. The process may be described in terms of a software routine implemented, for example, by the analysis module 190. A person of skill in the art should recognize, however, that the process may be implemented via hardware, firmware (e.g. via an ASIC) or any combination of software, firmware, and/or hardware. Furthermore, the sequence of steps of the process is not fixed, but can be altered into any desired sequence as recognized by a person of skill in the art.

In step 1410, the remapping module 170 generates a diffusion map for the group of N assets (using, for example, the process described with respect to FIG. 8). This diffusion map generation creates a cloud of points in N-dimensional space (whose 2-D and 3-D representations are discussed above). In step 1420, the analysis module 190 reduces the cloud to some small number of dimensions (for example, five) among the most significant dimensions (using, for example, the most significant non-trivial eigenvalues and eigenvectors from the diffusion map generated by the remapping module 170). In addition, the analysis module 190 weights each point in the diffusion map by its portfolio weight. For example, to evaluate an index such as the S&P 500, the weight of a particular stock corresponds to its weight in the index.

The cloud may have a fairly irregular shape. However, in step 1430, this property may be ignored, and the weighted sample of points may be assumed to have been taken from a multivariate normal distribution. This assumption allows the parameters of this multivariate normal distribution to be determined using standard methods. For example, in step 1440, according to one exemplary embodiment, the analysis module 190 determines the estimated covariance matrix from the weighted sample of points. The covariance matrix describes the “extent” of the cloud: the larger the variances, the bigger the cloud.

In step 1450, the analysis module 190 defines the global concentration measure to be 1/√(tr Σ), where Σ is the covariance matrix estimated in step 1440 and tr Σ is the trace of Σ, that is, the sum of the main diagonal entries of Σ. Thus, the higher the global concentration, the smaller the cloud (taking the weights into account). According to one exemplary embodiment, since the analysis module 190 carries out this global concentration measure computation using only the first five coordinates in the diffusion map, significant amounts of computation are saved compared to using all N coordinates (for example, when N=500). It should be noted that the scale of the global concentration measure may depend on parameters such as the sample population of assets, the length of the return series, the number of coordinates used in the calculation, etc. Accordingly, in one exemplary embodiment, these parameters are held constant.
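Steps 1420 through 1450 can be sketched as follows. The function name `global_concentration` and the toy coordinates are hypothetical, and the covariance is estimated as a plain weighted second moment about the weighted mean (with weights normalized to sum to 1), which is one of several standard estimators and is assumed here only for illustration.

```python
import numpy as np

def global_concentration(coords, weights, n_dims=5):
    """1 / sqrt(trace of the weighted covariance) of the leading diffusion coordinates."""
    X = np.asarray(coords, dtype=float)[:, :n_dims]   # step 1420: keep leading dimensions
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize portfolio weights
    mean = w @ X                                      # weighted center of the cloud
    centered = X - mean
    cov = (centered * w[:, None]).T @ centered        # step 1440: weighted covariance
    return 1.0 / np.sqrt(np.trace(cov))               # step 1450

# Toy cloud of two points (illustrative values only).
cloud = np.array([[0.0, 0.0],
                  [2.0, 0.0]])
equal = global_concentration(cloud, [0.5, 0.5])
spread = global_concentration(2.0 * cloud, [0.5, 0.5])
tilted = global_concentration(cloud, [0.9, 0.1])
```

Doubling the size of the cloud halves the measure, and shifting weight toward one point raises it, consistent with a higher value indicating a more concentrated portfolio. Dividing the measure computed with portfolio weights by the measure computed with index weights would give a relative measure of the kind described for an actively managed portfolio.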

Using example techniques such as method 1400, the analysis module 190 may quickly and succinctly measure the overall portfolio concentration and diversification. These portfolio concentration and diversification measurements may allow those using the measurements to adjust assets or their corresponding weights in the portfolio to achieve a more desired concentration or diversification. According to one embodiment, based on the global concentration measurement, one or more assets may be recommended to a user for including and/or excluding from the current portfolio.

In one exemplary embodiment, the global concentration measure for a known index, such as the S&P 500, is compared to that of an actively managed portfolio containing constituents from the same index, such as from the S&P 500 index constituents. In this embodiment, and using the S&P 500 as the example index, the analysis module 190 computes the global concentration measure for the S&P 500 using the S&P 500 index weights (using, for example, the method of FIG. 14). In addition, the analysis module 190 also computes the global concentration measure for the portfolio of interest using the diffusion map coordinates computed for the S&P 500 index constituents, but using portfolio weights rather than the index weights. Then the analysis module 190 can output the ratio between the portfolio global concentration measure and the S&P 500 index global concentration measure as a relative global concentration measure. The relative global concentration measure provides a way to compare the diversification of the portfolio of interest to that of the S&P 500. In one exemplary embodiment, this process is extended to portfolios with holdings outside of the S&P 500.

Local Concentrations within a Portfolio

As well as assessing global concentration, i.e., how diversified the whole group is, local concentration can also be analyzed. For example, suppose assets can be subject to localized or idiosyncratic shocks that affect only specific regions of the abstract space of assets. It would be beneficial to know which of these shocks are the most important (e.g., have the greatest potential impact). In other words, in which regions are the assets most concentrated? Another way to phrase this is, where are the local concentrations within a portfolio?

To help make this idea more precise, consider a functional form for a “local shock function” that describes such a localized shock. It is convenient to take a symmetrical normal density function, rescaled to have unit maximum; this describes a shock that has a smooth peak at a single point in space and decays fairly rapidly after that. The extent of the shock can be specified by the scale parameter ε in the normal density function: choosing a larger value of ε corresponds to focusing on shocks that affect assets in a wider local region (such as “less idiosyncratic” shocks). It should be noted that it only makes sense to give this spatial definition of a “local shock” because a geometrical representation of the correlation matrix was defined in terms of a Euclidean space.

According to one embodiment, the weighted points in the cloud may be regarded as collectively describing a discrete measure on Euclidean space. For a given shock location, the integral of the local shock function with respect to this measure describes the total impact of the shock on the whole group of assets. It should be noted that this integral is really just a finite sum, and the contribution to the sum comes mainly from stocks closer to the center of the shock, and with higher market caps. The (largest ε-) local concentration is given by the location that maximizes the value of this integral. According to one embodiment, only the first several coordinates of the diffusion map (for example, five) are used in this calculation, assuming that these coordinates capture most of the relevant information.
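The local shock function and the search for the largest local concentration can be sketched as follows. Several choices here are assumptions made only for illustration: ε is treated as the standard deviation of the rescaled normal density, candidate shock centers are restricted to the asset locations themselves rather than all of Euclidean space, and the function names and toy data are hypothetical.

```python
import numpy as np

def shock_impact(center, coords, weights, eps):
    """Weighted sum of the unit-maximum Gaussian shock centered at `center`."""
    sq = ((coords - center) ** 2).sum(axis=1)        # squared distance to the center
    return float(weights @ np.exp(-sq / (2.0 * eps ** 2)))

def largest_local_concentration(coords, weights, eps):
    """Best shock center among the asset locations (assumed candidate set)."""
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    impacts = np.array([shock_impact(c, coords, weights, eps) for c in coords])
    best = int(np.argmax(impacts))
    return coords[best], float(impacts[best])

# Toy cloud: three small assets clustered together and one large isolated asset.
coords = np.array([[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [1.0, 1.0]])
weights = np.array([0.2, 0.2, 0.2, 0.4])

wide_center, wide_impact = largest_local_concentration(coords, weights, eps=0.05)
narrow_center, narrow_impact = largest_local_concentration(coords, weights, eps=0.001)
```

With the wider shock, the cluster of three smaller assets forms the largest local concentration; with the very narrow shock, each shock effectively covers a single asset and the largest isolated weight wins.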

FIG. 15 shows an exemplary 3-D scatter diagram of S&P 500 return data highlighting five local concentrations according to an embodiment of the present invention. FIG. 16 shows an exemplary 3-D scatter diagram of S&P 500 return data zooming in on the largest local concentration in FIG. 15 according to an embodiment of the present invention.

FIG. 15 shows the diffusion map for the S&P 500 index, using the R² kernel and monthly total returns during the post-crisis (2009-2012) period. The size of each symbol corresponds to the 2012 market cap of the stock. The figure also shows, in gray, the five largest local concentrations identified as above, using the scale parameter ε=0.005, which corresponds to a shock extending over about the width of two grid cubes in the diagram.

As can be seen in FIG. 15, the largest local concentration is near the center of the cloud, where the shock can pick up quite a few stocks, including some with fairly large market caps such as XOM and IBM. The second largest local concentration is very close to AAPL and Google, GOOG. Another important local concentration is near Procter & Gamble, PG.

FIG. 16 zooms into the largest local concentration to give a closer look at the stocks affected. In FIG. 16, the intensity of each ticker symbol indicates the value of the local shock function at the corresponding asset, namely the proximity of the corresponding asset to the center of the shock, with darker shading (such as for XOM) indicating the closest to the center, and lighter shading indicating progressively further from the center (and hence, less affected by the shock). It is apparent that this local shock affects a number of stocks from different sectors that have had a tendency to move together during the recent period.

One can now iterate the calculation of local concentration as follows:

(1) Specify a scale parameter ε.

(2) Find the largest local concentration, as above.

(3) “Subtract the local shock from the weights”; i.e., for each stock, multiply the market cap by (1 − value of the local shock function at that stock).

(4) Return to step 2, and repeat as many times as desired.

The result of this process is the (ε-)local concentration profile, which shows the size (and location) of shocks, of decreasing importance, that can affect the cloud. The local concentration profile contains information on the concentration/diversification of the group of assets over and above the information in the global concentration measure.
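The four-step iteration above can be sketched as a greedy loop. As an illustration only, the sketch assumes that ε is the standard deviation of the unit-maximum Gaussian shock and that shock centers are restricted to the asset locations; the function name `local_concentration_profile` and the toy data are hypothetical.

```python
import numpy as np

def local_concentration_profile(coords, weights, eps, n_shocks=3):
    """Greedy list of (asset index of shock center, impact), largest first."""
    coords = np.asarray(coords, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    # shock[i, j]: value at asset j of a shock centered on asset i (step 1 fixes eps)
    sq = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    shock = np.exp(-sq / (2.0 * eps ** 2))
    profile = []
    for _ in range(n_shocks):
        impacts = shock @ w                  # impact of every candidate shock
        best = int(np.argmax(impacts))       # step 2: largest local concentration
        profile.append((best, float(impacts[best])))
        w = w * (1.0 - shock[best])          # step 3: subtract the shock from the weights
    return profile                           # step 4 is the loop itself

# Toy cloud: three small assets clustered together and one large isolated asset.
coords = np.array([[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [1.0, 1.0]])
weights = np.array([0.2, 0.2, 0.2, 0.4])
profile = local_concentration_profile(coords, weights, eps=0.05, n_shocks=2)
```

Because each pass only shrinks the weights, the impacts in the returned profile never increase from one entry to the next. In this toy cloud the first entry is the cluster of three assets and the second is the isolated large asset.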

The choice of scale parameter ε is important. The local concentration profile depends on the choice of the scale parameter ε, and different choices of ε may reveal different aspects of the data. If ε is too large, it may pick up too many stocks in its corresponding local shock (for instance, practically the whole cloud may become a shock). This may reduce the local concentration profile to a list of stocks weighted jointly by their proximity to the center of the cloud and by their portfolio weight (or market cap). On the other hand, if ε is too small, it may pick up only one (or essentially one) stock at a time. In that case, the local concentration profile may simply be the list of individual stocks in descending order of portfolio weight. Using scatter diagrams (such as those in FIGS. 15-16) according to embodiments of the present invention, one of ordinary skill in the art can find appropriate values of the scale parameter ε for a particular set of assets to determine useful local concentration profiles without undue experimentation.

As with the global concentration, and using the S&P 500 as an exemplary (and non-limiting) index, the local concentration profile according to embodiments of the present invention can be extended to actively managed portfolios that only hold S&P 500 index constituents. In one exemplary embodiment, the signed measure defined by the portfolio relative weights is used in place of the S&P 500 weights, and a similar maximization procedure (as that described above) is used to determine the (signed) relative local concentrations of the portfolio versus the S&P 500 index.

Using Embodiments of the Invention to Help Make Investment Decisions

Through use of the visualization and analysis tools of embodiments of the present invention, an investor or financial advisor may gain a better understanding of how correlations have changed over time, which may help the investor form a qualitative judgment about how the correlations may continue in the investment horizon. Investment decisions may be made based on such correlations.

According to one embodiment, the scatter diagrams and associated correlation data may be used for diversification of a portfolio to reduce risk. Diversification may not necessarily be best achieved by owning a large number of companies, or companies across a broad range of industries. It may be preferable to own a smaller number of positions, but make sure they are spread out across different places in the cloud, far apart from each other.
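One way to picture "a smaller number of positions spread out across the cloud" is a greedy farthest-point selection over the scatter-diagram coordinates. The sketch below is only an illustration under stated assumptions: the function name `spread_out_selection`, the asset labels, and the two-dimensional coordinates are hypothetical, and the greedy max-min rule is a common heuristic rather than a procedure taken from this description.

```python
import math

def spread_out_selection(coords, k, start):
    """Greedily pick up to k labels that are far apart from each other.

    coords: dict mapping a label to its coordinates in the scatter diagram
            (hypothetical values; any dimension works).
    start:  label of the first position, chosen by the user.
    """
    chosen = [start]
    while len(chosen) < min(k, len(coords)):
        remaining = [label for label in coords if label not in chosen]
        # Pick the asset whose nearest already-chosen neighbor is farthest away.
        best = max(
            remaining,
            key=lambda label: min(
                math.dist(coords[label], coords[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen

# Hypothetical coordinates: A and B nearly coincide, C and E are close to each other.
coords = {
    "A": (0.0, 0.0),
    "B": (0.1, 0.0),
    "C": (1.0, 0.0),
    "D": (0.0, 1.2),
    "E": (0.9, 0.1),
}
selection = spread_out_selection(coords, k=3, start="A")
print(selection)
```

With these illustrative coordinates, the selection skips B (almost on top of A) and E (close to C), so the three positions held sit in different regions of the cloud.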

According to one embodiment of the invention, the display module 180 may be configured to display a scatter diagram depicting correlations between assets, and further configured to receive a user selection of one of the assets in the scatter diagram. In response to the selection of the asset, the display module 180 may be configured to identify a second asset with a maximum distance from the selected asset, or other desired position relative to the selected asset. The second asset may be highlighted in the scatter diagram, information about the second asset retrieved from the storage device 130, and/or the second asset recommended to the user as the asset that is farthest from, and thus least correlated with, the selected asset. In response to such recommendation, the user may select the second asset and input the second asset into a software application configured to generate a portfolio based on selected assets.
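The lookup of a second asset at maximum distance from the selected asset can be sketched as follows. This is a minimal illustration, assuming each asset already has coordinates in the displayed Euclidean space; the function name `farthest_asset`, the ticker labels, and the coordinate values are hypothetical and are not part of the display module 180 as described.

```python
import math

def farthest_asset(coords, selected):
    """Return the label of the asset at maximum Euclidean distance from `selected`.

    coords: dict mapping a label to its (hypothetical) scatter-diagram coordinates.
    """
    origin = coords[selected]
    others = (label for label in coords if label != selected)
    return max(others, key=lambda label: math.dist(origin, coords[label]))

# Hypothetical 3-D coordinates for four assets.
coords = {
    "AAA": (0.0, 0.0, 0.0),
    "BBB": (0.1, 0.0, 0.0),
    "CCC": (0.0, 0.9, 0.2),
    "DDD": (-0.5, -0.5, 0.0),
}
recommended = farthest_asset(coords, "AAA")
print(recommended)
```

Replacing the `max` key with a different criterion (for example, distance closest to a target value) would cover the "other desired position relative to the selected asset" case.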

According to one embodiment, existing funds (such as mutual funds) are analyzed to see how their constituent assets measure up to an established index, such as the S&P 500. This can be done at the global concentration level and at the local concentration level, as described above. For instance, to modify the fund to behave closer to the S&P 500, its constituent assets can be adjusted to produce a similar global concentration and a similar local concentration profile as that of the S&P 500.

According to one embodiment, existing funds are analyzed to see how diverse or concentrated their holdings are. Here, the goal may be to increase diversification by selecting or adjusting constituent assets to lower the global concentration as well as identify and mitigate any large local concentrations, as measured by the exemplary techniques described above.

According to one embodiment, existing funds are analyzed using the above techniques to see how diverse they are from each other. Here, the goal may be to consolidate funds that have similar behavior, to maintain the diversity of funds with different behavior (by adjusting constituent assets in a way that maintains this diversity), or to assist with selection of constituent funds for a fund-of-funds investment vehicle.

FIG. 17 shows an exemplary 3-D scatter diagram of individual sectors of the U.S. bond market according to an embodiment of the present invention.

In FIG. 17, individual sectors of the U.S. bond market are displayed as the collection of financial assets. Here, the label size for each of the different bond classifications, such as Financial, Corporate, Industrial, or MBS (mortgage-backed securities), is used to indicate the relative volatility of the particular bond. As with the display of stocks in some of the earlier figures, in FIG. 17, the distance between the different labels is inversely proportional to the correlation of the corresponding bond sectors. For example, the five central bond sectors in FIG. 17, namely Industrial, Corporate, Credit, CMBS (commercial MBS), and Financial, are close together, showing strong correlation between these sectors, while other sectors, such as MBS, ABS (asset-backed securities), and HY (high yield), are spread far apart, indicating less correlation between these sectors.

According to one embodiment, diagrams like the one shown in FIG. 17 are used to help assess how the individual bond sectors move relative to, for example, treasury securities. This can be useful to investors, for example, in deciding how much to allocate to different bond sectors, such as government bonds, corporate bonds, or mortgages.

According to one embodiment, a user invokes the one or more modules of the computer device for comparing a scatter diagram for an input set of financial instruments, portfolios, indices, or asset classes, to the scatter diagram for an existing set of financial instruments, portfolios, indices, or asset classes. In this regard, the user may invoke a graphical user interface provided by one or more modules of the computer device to select the input set (e.g. by selecting one or more identifiers of the input set), and further invoke the graphical user interface to select the existing set to be compared against (e.g. by selecting one or more identifiers of the existing set).

The user may further submit a command to generate a comparison of the scatter diagrams of the two sets. In response to such a command, one or more modules of the computer device may be configured to display the scatter diagram of the existing set via a particular visual depiction, and further display the scatter diagram of the input set according to a different visual depiction. The two scatter diagrams may be overlaid on top of each other. In this regard, a viewer of the two scatter diagrams may understand, at a glance, how the correlations in the input set track with the correlations of the existing set. Based on this understanding from the display, the user may invoke the graphical user interface to add and/or delete an asset to and/or from the input set, or to adjust an attribute (e.g. investment amount) with respect to an asset in the input set. In response to the user command, the computer device adds or deletes the asset, and/or modifies the attribute, as indicated by the user.

While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims, and equivalents thereof.

1. A method for displaying a matrix of correlations of a plurality of financial instruments, portfolios, indices, or asset classes, the method comprising: identifying by a processor the matrix of correlations; converting by the processor the matrix of correlations into a probability transition matrix; defining by the processor a corresponding abstract distance measurement between any two of the financial instruments, portfolios, indices, or asset classes based on the probability transition matrix; assigning by the processor coordinates in a Euclidean space to each of the financial instruments, portfolios, indices, or asset classes corresponding to non-unit eigenvalues of the probability transition matrix, wherein a Euclidean distance between said any two of the financial instruments, portfolios, indices, or asset classes in the Euclidean space closely approximates the corresponding abstract distance measurement; and displaying on a display device the financial instruments, portfolios, indices, or asset classes based on particular dimensions of the Euclidean space corresponding to larger ones of the eigenvalues.
2. The method of claim 1, wherein each of the correlations is derived from the standard correlation coefficient of a corresponding pair of the financial instruments, portfolios, indices, or asset classes.
3. The method of claim 2, wherein each of the correlations is one more than the standard correlation coefficient of the corresponding pair of the financial instruments, portfolios, indices, or asset classes.
4. The method of claim 1, wherein a number of the particular dimensions is three.
5. The method of claim 4, wherein the particular dimensions comprise the dimensions of the Euclidean space corresponding to the three largest ones of the eigenvalues.
6. The method of claim 4, wherein the displaying of the financial instruments, portfolios, indices, or asset classes comprises displaying an identifying label for each of the financial instruments, portfolios, indices, or asset classes in a 3-dimensional Euclidean representation on the display device.
7. The method of claim 6 further comprising modifying by the processor the 3-dimensional Euclidean representation on the display device in response to a user command.
8. The method of claim 6 further comprising displaying by the processor on the display device successive representations of correlation data as observed on successive dates.
9. The method of claim 6 further comprising adjusting by the processor a color or size of the identifying label to correspond to a respective value of an additional numerical characteristic being displayed in the 3-dimensional Euclidean representation on the display device for each of the financial instruments, portfolios, indices, or asset classes.
10. The method of claim 1, wherein a number of the particular dimensions is two.
11. The method of claim 10, wherein the particular dimensions comprise the dimensions of the Euclidean space corresponding to the two largest ones of the eigenvalues.
12. The method of claim 10, wherein the displaying of the financial instruments, portfolios, indices, or asset classes comprises displaying an identifying label for each of the financial instruments, portfolios, indices, or asset classes in a 2-dimensional Euclidean representation on the display device.
13. The method of claim 1 further comprising generating by the processor a measure of diversification of the financial instruments, portfolios, indices, or asset classes.
14. The method of claim 13, wherein the generating of the measure of diversification of the financial instruments, portfolios, indices, or asset classes comprises generating the measure of diversification using the particular dimensions of the Euclidean space.
15. The method of claim 13, wherein the measure of diversification comprises a global concentration, a relative global concentration, or a largest local concentration.
16. The method of claim 15, wherein the measure of diversification comprises a global concentration; the generating of the global concentration comprises: assigning by the processor a weight to each of the financial instruments, portfolios, indices, or asset classes; and weighting by the processor a contribution of each of the financial instruments, portfolios, indices, or asset classes by its respective said weight in the global concentration.
17. The method of claim 16 further comprising generating by the processor a portfolio diversification measure by: identifying by the processor ones of the financial instruments, portfolios, indices, or asset classes; assigning by the processor second weights to respective said ones of the financial instruments, portfolios, indices, or asset classes; and generating by the processor the global concentration by only using the ones of the financial instruments, portfolios, indices, or asset classes in place of each of the financial instruments, portfolios, indices, or asset classes, and using the second weights in place of the weight of each of the financial instruments, portfolios, indices, or asset classes.
18. The method of claim 1 further comprising generating by the processor a sequence of successively less significant local concentrations of the financial instruments, portfolios, indices, or asset classes.
19. The method of claim 1 further comprising generating by the processor a plurality of relative local concentrations of the Euclidean space.
20. The method of claim 1 further comprising generating by the processor a numerical summary measure of accuracy with which the Euclidean distance as measured in the particular dimensions of the Euclidean space approximates the corresponding abstract distance measurement.
21-30. (canceled)